Object detection in infrared images has attracted growing attention in recent years. However, few studies address multi-scale object detection in infrared street scene images, and the lack of high-quality infrared datasets hinders research into such algorithms. To address these issues, we make a series of modifications to the Faster Region-based Convolutional Neural Network (Faster R-CNN). First, a double-layer region proposal network (RPN) is proposed to predict proposals of different scales on both fine and coarse feature maps. Second, a multi-scale pooling module is introduced into the backbone of the network to explore the response of objects at different scales. In addition, the inception4 module and the position-sensitive region of interest (ROI) align (PSalign) pooling layer are used to extract richer object features. Third, this paper proposes instance-level data augmentation, which accounts for the imbalance between categories while enlarging the dataset. In the training stage, online hard example mining is used to further improve the robustness of the algorithm in complex environments. Experimental results show that, compared with the baseline, our detection method achieves state-of-the-art performance.
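To make the training-stage step concrete, the following is a minimal sketch of the selection rule behind online hard example mining: compute a loss for every candidate region, then backpropagate only through the highest-loss ones. It is an illustration under stated assumptions, not the implementation used in this work. The function names, the `keep` parameter, and the loss values are hypothetical, and the sketch omits the removal of heavily overlapping regions that a full pipeline would typically perform before selection.

```python
from typing import List, Sequence


def select_hard_examples(losses: Sequence[float], keep: int) -> List[int]:
    """Return the indices of the `keep` highest-loss regions, hardest first."""
    if keep <= 0:
        return []
    # Sort region indices by loss in descending order; ties keep the lower index first.
    order = sorted(range(len(losses)), key=lambda i: (-losses[i], i))
    return order[:keep]


def ohem_loss(losses: Sequence[float], keep: int) -> float:
    """Average the loss over the selected hard examples only."""
    hard = select_hard_examples(losses, keep)
    if not hard:
        return 0.0
    return sum(losses[i] for i in hard) / len(hard)


# Hypothetical per-region losses for one image (illustrative values only).
roi_losses = [0.05, 1.20, 0.30, 2.10, 0.01, 0.75]
hard_indices = select_hard_examples(roi_losses, keep=3)
batch_loss = ohem_loss(roi_losses, keep=3)
print(hard_indices, round(batch_loss, 4))
```

Easy regions (here those with losses 0.05, 0.30, and 0.01) contribute nothing to the update, so training effort concentrates on the examples the detector currently gets wrong.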